We aim to answer the following question:
How does the implementation of different strategies affect the rates of COVID-19 infection?
Throughout the COVID-19 pandemic, many countries have adopted mobility restriction policies such as lockdowns. Major technology companies have released mobility datasets that describe the change in mobility, relative to a baseline, across different categories of places or modes of transportation.
We observe that countries cluster based on the mobility data, revealing patterns and similarities between government measures across different countries. We build a model that predicts mobility change based on non-pharmaceutical interventions (NPIs).
Additionally, we build a Bayesian model that leverages an epidemiological compartmental model to estimate the effective reproduction number $R_t$ over time as a function of mobility data.
This opens the possibility of building an end-to-end pipeline from NPIs to $R_t$ and to the compartments (e.g. the number of hospitalized, critical and deceased individuals), in order to generate what-if scenarios that are useful for evaluating government policies both retrospectively and prospectively.
We leverage the Coronavirus Government Response Tracker dataset from the Blavatnik School of Government at the University of Oxford.
We are interested in the closure and containment indicators, which are listed below. Note that an indicator is at zero when the corresponding measure is not in place.
C1 School closing
Record closings of schools and universities
C2 Workplace closing
Record closings of workplaces
C3 Cancel public events
Record cancelling public events
C4 Restrictions on gatherings
Record the cut-off size for bans on private gatherings
C5 Close public transport
Record closing of public transport
C6 Stay at home requirements
Record orders to “shelter-in-place” and otherwise confine to home
C7 Restrictions on internal movement
Record restrictions on internal movement
C8 International travel controls
Record restrictions on international travel
TODO: insert visualizations of the Oxford dataset here
We gathered data from Google and Apple, describing respectively 6 and 3 mobility categories.
For the Google data, the baseline is the median value for the corresponding day of the week during the 5-week period Jan 3–Feb 6, 2020.
Apple
relative volume of directions requests per country/region, sub-region or city compared to a baseline volume on 13 January 2020
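To make the day-of-week baseline concrete, here is a minimal sketch of how such a relative change could be computed from a raw daily series. The published datasets already contain the relative change, so the raw counts, the function names `weekday_baseline` and `relative_change`, and the lockdown date below are purely hypothetical.

```python
from datetime import date, timedelta
from statistics import median


def weekday_baseline(series, start=date(2020, 1, 3), end=date(2020, 2, 6)):
    """Median value per day of the week over the baseline window (inclusive)."""
    by_weekday = {}
    for day, value in series.items():
        if start <= day <= end:
            by_weekday.setdefault(day.weekday(), []).append(value)
    return {wd: median(vals) for wd, vals in by_weekday.items()}


def relative_change(series, baseline):
    """Percentage change of each day relative to its own weekday baseline."""
    return {
        day: 100.0 * (value - baseline[day.weekday()]) / baseline[day.weekday()]
        for day, value in series.items()
    }


# Hypothetical raw activity counts (not real data): 100 on weekdays,
# 60 on weekends, then a flat 50 once a lockdown starts on 2020-03-16.
raw = {}
day = date(2020, 1, 3)
while day <= date(2020, 3, 22):
    normal = 100.0 if day.weekday() < 5 else 60.0
    raw[day] = 50.0 if day >= date(2020, 3, 16) else normal
    day += timedelta(days=1)

baseline = weekday_baseline(raw)
change = relative_change(raw, baseline)
print(change[date(2020, 3, 16)])  # a Monday during the hypothetical lockdown
print(change[date(2020, 3, 21)])  # a Saturday during the hypothetical lockdown
```

Comparing each day with the baseline for the same day of the week keeps the ordinary weekday/weekend rhythm from showing up as a spurious change in mobility.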
TODO: insert mobility visualizations here
The intuition we had, working on this problem, was that mobility could be a useful proxy for measures. In order to verify this hypothesis we decided to take a few approaches.
We see some strong similarities between countries. For example, Italy and France were both hit hard by COVID-19 and enforced a total lockdown. But how can we visualize these typologies better?
With this new point of view, we can detect countries that have taken similar measures, leading to similar mobility. For instance, this approach shows that Japan and Sweden have similar mobility profiles: both avoided a strict lockdown. On the other hand, Italy, France and Great Britain belong to the same cluster of drastic mobility loss. Finally, we can also detect countries that handled the crisis with intermediate measures, such as Denmark, Germany and Norway. This clustering therefore seems to validate the use of mobility as a convenient proxy for measures.
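As an illustration of the idea only, grouping countries by the similarity of their mobility profiles can be sketched as follows. This is not necessarily the clustering method used for the results above: the Euclidean distance, the single-linkage threshold rule, the function `threshold_clusters` and the toy profiles are all assumptions.

```python
from math import sqrt


def euclidean(a, b):
    """Euclidean distance between two equally long mobility profiles."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def threshold_clusters(profiles, threshold):
    """Single-linkage grouping: two profiles end up in the same cluster when
    they are connected by a chain of pairwise distances below `threshold`."""
    parent = {name: name for name in profiles}

    def find(name):
        # Union-find lookup with path halving.
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    names = list(profiles)
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if euclidean(profiles[a], profiles[b]) < threshold:
                parent[find(a)] = find(b)

    clusters = {}
    for name in names:
        clusters.setdefault(find(name), []).append(name)
    return sorted(sorted(group) for group in clusters.values())


# Hypothetical weekly mobility changes in percent (not real data).
profiles = {
    "strict_1": [-10, -60, -80, -75],
    "strict_2": [-15, -65, -78, -70],
    "mild_1": [-5, -20, -25, -20],
    "mild_2": [-8, -18, -22, -25],
}
print(threshold_clusters(profiles, threshold=20.0))
```

With these toy profiles, the two strict profiles and the two mild profiles fall into separate groups, which mirrors the lockdown versus no-lockdown split described above.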
Let $M_{t, i} \geq 0$ denote the relative mobility in category $i$ at time $t$, rescaled here to ease modelling.
In this section, we thus have $M_{t, i} = 1$ when there is no change in mobility, and $M_{t, i} = 0$ when mobility has dropped by 100% (a theoretical limit).
For a mobility category $i$ and a time $t$, we model $m_{t, i} = \mathbb{E}[M_{t, i}]$ with:
$$M_{t, i} \sim \mathcal{N}(m_{t, i}, \sigma)$$ where the prior for $\sigma$ is $\sigma \sim \mathrm{Exponential}(1)$.
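We model:
$$m_{t, i} = \exp\left(- \sum_k \alpha_{k, i} I_{k, t}\right)$$ with $I_{k, t}$ the value of indicator $k$ at time $t$ and $\alpha_{k, i}$ the effect of indicator $k$ on mobility category $i$.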
For simplicity, we do not include the residential category, which moves in the opposite direction to the other mobility categories. We also remove the parks series, which is quite noisy in some countries.
We fit this model on multiple countries with the same parameters, in order to extract a country-independent effect of measures on mobility.
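The mobility model above can be sketched in a few lines for a single mobility category. This is an illustration rather than the fitted model: the effect values in `alphas` are hypothetical, $\sigma$ is assumed to be a standard deviation, and no inference is performed here.

```python
from math import exp, log, pi


def expected_mobility(alphas, indicators):
    """m = exp(-sum_k alpha_k * I_k) for one category on one day."""
    return exp(-sum(a * i for a, i in zip(alphas, indicators)))


def gaussian_log_likelihood(observed, alphas, indicator_rows, sigma):
    """Log-likelihood of the observed M_t under N(m_t, sigma),
    with sigma taken as a standard deviation (an assumption)."""
    total = 0.0
    for m_obs, row in zip(observed, indicator_rows):
        m = expected_mobility(alphas, row)
        total += -0.5 * log(2 * pi * sigma ** 2) - (m_obs - m) ** 2 / (2 * sigma ** 2)
    return total


# Hypothetical effects of two indicators on one mobility category.
alphas = [0.10, 0.25]
print(expected_mobility(alphas, [0, 0]))  # no measure in place: mobility stays at 1
print(expected_mobility(alphas, [3, 2]))  # both measures active: mobility below 1
```

The exponential form keeps the expected mobility positive and makes the effects of several measures multiply: each active indicator scales mobility by a factor $\exp(-\alpha_{k, i} I_{k, t})$.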
The model builds on compartmental models. Each of the following letters represents a compartment of a country's population:
S - susceptible
E - exposed
I - infectious
R - recovered
H - hospitalized
C - critical
D - deceased
We normalize these counts by the total population of the country.
$R_t$ = reproduction number at time $t$.
Typical value 3.6* at $t=0$
Transition times
T_inc = average incubation period. Typical 5.6* days
T_inf = average infectious period. Typical 2.9 days
T_hosp = average time a patient is in hospital before either recovering or becoming critical. Typical 4 days
T_crit = average time a patient is in a critical state before either recovering or dying. Typical 14 days
Fractions
These constants are likely to be age specific (hence the subscript a):
m_a = fraction of infections that are asymptomatic or mild. Assumed 80% (i.e. 20% severe)
c_a = fraction of severe cases that turn critical. Assumed 10%
f_a = fraction of critical cases that are fatal. Assumed 30%
*Averages taken from https://www.kaggle.com/covid-19-contributions
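The transitions between compartments are not spelled out in this section. As a minimal sketch, assuming the commonly used SEIR-HCD formulation (mild cases recover directly, severe cases are hospitalized, critical cases either die or return to hospital), one discrete-time update on population fractions could look like this. The transition equations, the forward-Euler scheme, the function name `seir_hcd_step` and the seeding are assumptions; the default parameters are the typical values listed above.

```python
def seir_hcd_step(state, r_t, t_inc=5.6, t_inf=2.9, t_hosp=4.0, t_crit=14.0,
                  m_a=0.8, c_a=0.1, f_a=0.3, dt=1.0):
    """One forward-Euler step of an assumed SEIR-HCD model.

    `state` holds the fractions (S, E, I, R, H, C, D) of the population.
    """
    s, e, i, r, h, c, d = state
    new_exposed = r_t / t_inf * i * s
    ds = -new_exposed
    de = new_exposed - e / t_inc
    di = e / t_inc - i / t_inf
    # Mild cases recover directly; non-critical hospitalized patients recover.
    dr = m_a * i / t_inf + (1 - c_a) * h / t_hosp
    # Severe cases enter hospital; surviving critical patients return to it.
    dh = (1 - m_a) * i / t_inf + (1 - f_a) * c / t_crit - h / t_hosp
    dc = c_a * h / t_hosp - c / t_crit
    dd = f_a * c / t_crit
    derivs = (ds, de, di, dr, dh, dc, dd)
    return tuple(x + dt * dx for x, dx in zip(state, derivs))


# Hypothetical seed: one exposed individual per million, everyone else susceptible.
state = (1.0 - 1e-6, 1e-6, 0.0, 0.0, 0.0, 0.0, 0.0)
trajectory = [state]
for _ in range(200):  # 200 half-day steps, i.e. 100 days
    trajectory.append(seir_hcd_step(trajectory[-1], r_t=3.6, dt=0.5))

deceased = [s[-1] for s in trajectory]
print(f"deceased fraction after 100 days: {deceased[-1]:.6f}")
```

In this formulation the derivatives sum to zero, so the seven fractions always add up to one, which is a convenient sanity check when changing the transitions.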
The Conversation article:
Imperial
Let $\mathrm{m}_{t, i}$ be the relative change in mobility in category $i$ with respect to a baseline (before the pandemic), so that a reduction in mobility corresponds to a negative value.
Hence $\mathrm{m}_{t, i} > -1$.
We adopt the strong hypothesis that each mobility category presents the same transmission dynamics. Let $R_{t, i}$ be defined by the following linear relationship: $$ R_{t, i} = (1 + \mathrm{m}_{t, i})R_0 - \mathrm{m}_{t, i}R_1 $$ for any time $t$ through the pandemic, with $R_0$, $R_1$ to be estimated.
We model the effective reproduction number as a weighted mean over the mobility categories:
$$R_t = \sum_i \alpha_i R_{t, i}$$ with $\alpha_i$ to be estimated, where
We choose the following prior: $\forall i, \alpha_i \sim \mathrm{Gamma}(1, 0.5)$
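The mapping from mobility to $R_t$ can be checked numerically with a short sketch. The weights in `alphas` are hypothetical and are chosen to sum to 1 only so that the two extreme cases are easy to read (the model itself estimates them); `r0` and `r1` are set to the prior locations given for $R_0$ and $R_1$ in this document.

```python
def category_r(m, r0, r1):
    """R_{t,i} = (1 + m) * R_0 - m * R_1 for a relative mobility change m."""
    return (1 + m) * r0 - m * r1


def effective_r(mobility_changes, alphas, r0, r1):
    """R_t = sum_i alpha_i * R_{t,i} over the mobility categories."""
    return sum(a * category_r(m, r0, r1) for a, m in zip(alphas, mobility_changes))


r0, r1 = 3.28, 0.7        # prior locations of R_0 and R_1
alphas = [0.5, 0.3, 0.2]  # hypothetical weights, chosen to sum to 1

print(effective_r([0.0, 0.0, 0.0], alphas, r0, r1))     # baseline mobility
print(effective_r([-1.0, -1.0, -1.0], alphas, r0, r1))  # mobility fully suppressed
print(effective_r([-0.6, -0.4, -0.7], alphas, r0, r1))  # partial reduction
```

The linear relationship interpolates between the two parameters: at baseline mobility ($\mathrm{m}_{t, i} = 0$) each category contributes $R_0$, and at a full reduction ($\mathrm{m}_{t, i} = -1$) each contributes $R_1$.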
We use the following parameterizations:
Let $D_t$ be the number of deaths at time $t$ for a given country. We model $d_t = \mathbb{E}[D_t]$.
We sample $$D_t \sim \mathrm{GammaPoisson}(\psi, \frac{\psi}{d_t})$$ where $\psi \sim \mathcal{N}^+(0, 5)$
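To see why this parameterization gives $\mathbb{E}[D_t] = d_t$, assume that $\mathrm{GammaPoisson}(a, b)$ denotes a Poisson distribution whose rate follows a Gamma distribution with shape $a$ and rate $b$ (this convention is an assumption, and $\lambda_t$ is an auxiliary variable introduced only for this derivation):

$$\lambda_t \sim \mathrm{Gamma}\left(\psi, \frac{\psi}{d_t}\right), \qquad D_t \mid \lambda_t \sim \mathrm{Poisson}(\lambda_t)$$

$$\mathbb{E}[D_t] = \mathbb{E}[\lambda_t] = \frac{\psi}{\psi / d_t} = d_t$$

$$\mathrm{Var}[D_t] = \mathbb{E}[\lambda_t] + \mathrm{Var}[\lambda_t] = d_t + \frac{\psi}{(\psi / d_t)^2} = d_t + \frac{d_t^2}{\psi}$$

Under this convention, $\psi$ controls the overdispersion: the smaller $\psi$ is, the larger the variance relative to a Poisson distribution, and the Poisson case is recovered as $\psi \to \infty$.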
We seed the model with the following, assuming $N$ is the country population:
$R_0 \sim \mathcal{N}^+(3.28, \kappa_0)$
$R_1 \sim \mathcal{N}^+(0.7, \kappa_1)$
where $\kappa_0, \kappa_1 \sim \mathcal{N}^+(0, .5)$
$T_{inc} \sim \mathrm{Gamma}(5.6, .86)$
$T_{inf} \sim \mathrm{Gamma}(2.9, 1)$
$T_{hosp} \sim \mathrm{Gamma}(4, 1)$
$T_{crit} \sim \mathrm{Gamma}(14, 1)$
$m_a \sim \mathrm{Beta}(0.8, \phi_m)$
$c_a \sim \mathrm{Beta}(0.1, \phi_c)$
$f_a \sim \mathrm{Beta}(0.35, \phi_f)$
where $\phi_m, \phi_c, \phi_f \sim \mathrm{Gamma}(7, 2)$
We use a multi-country setting where we train the model on a pool of countries and then predict the number of deceased individuals, as well as $R_t$, on another country.
In order to do this, the model has to learn country-independent parameters, since the only factor that varies between countries is mobility.
todo
todo
NB: effectiveness of masks
We could leverage hospitalization and ICU data and feed them to the model, so that it learns from observations of the H, C and D compartments at once. This would increase confidence in the predictions of the deceased compartment and of $R_t$, and would enable the model to estimate the other compartments more accurately. One dataset that could be used is the one from IHME, which provides estimated numbers.
We could include other data sources as additional mobility measurements.
Here are some example questions of interest:
These questions are of utmost importance for policy makers.
Data sources